Accessibility of Tables in PDF Documents

Abstract

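People access and share information over the web and in other digital environments, including digital libraries, in the form of documents such as books, articles, technical reports, etc. These documents are in a variety of formats, of which the Portable Document Format (PDF) is the most widely used because of its emphasis on preserving the layout of the original material. The retrieval of relevant material from these derivative documents is challenging for information retrieval (IR) because the rich semantic structure of these documents is lost. The retrieval of important units such as images, figures, algorithms, mathematical formulas, and tables becomes a challenge. Among these elements, tables are particularly important because they can add value to the resource description, discovery, and accessibility of documents not only on the web but also in libraries if they are made retrievable and presentable to readers. Sighted users comprehend tables for sensemaking using visual cues, but blind and visually impaired users must rely on assistive technologies, including text-to-speech and screen readers, to comprehend tables. However, these technologies do not pay sufficient attention to tables in order to effectively present them to visually impaired individuals. Therefore, ways must be found to make tables in PDF documents not only retrievable but also comprehensible. Before developing solutions, it is necessary to review the available assistive technologies, tools, and frameworks for their capabilities, strengths, and limitations from the comprehension perspective of blind and visually impaired people, along with suitable environments like digital libraries. We found no review article that critically and analytically presents and evaluates these technologies. To fill this gap in the literature, this review paper reports on the current state of the accessibility of PDF documents, digital libraries, assistive technologies, tools, and frameworks that make PDF tables comprehensible and accessible to blind and visually impaired people. The study findings have implications for libraries, information sciences, and information retrieval.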

Journal

Journal title: Information Technology and Libraries

Year: 2021

ISSN: 0730-9295, 2163-5226

DOI: https://doi.org/10.6017/ital.v40i3.12325